20 research outputs found

    An ontology-based approach for modelling and querying Alzheimer’s disease data

    Background: The recent advances in biotechnology and computer science have led to an ever-increasing availability of public biomedical data distributed in large databases worldwide. However, these data collections are far from being standardized in a way that would allow them to be harmonized or even integrated, which makes it impossible to fully exploit the latest machine learning technologies for the analysis of the data themselves. Facing this huge flow of biomedical data is therefore a challenging task for researchers and clinicians because of its complexity and high heterogeneity. This is the case for neurodegenerative diseases, and in particular Alzheimer's Disease (AD), for which specialized data collections such as the one maintained by the Alzheimer's Disease Neuroimaging Initiative (ADNI) exist.
    Methods: Ontologies are controlled vocabularies that allow the semantics of data and their relationships in a given domain to be represented. They are often exploited to aid knowledge and data management in healthcare research. Computational ontologies are the result of combining data management systems with traditional ontologies. Our approach is i) to define a computational ontology representing a logic-based formal conceptual model of the ADNI data collection and ii) to provide a means for populating the ontology with the actual ADNI data. Together, these two components make it possible to query the ADNI database semantically and thus support data extraction in a more intuitive manner.
    Results: We developed i) a detailed computational ontology for the clinical multimodal datasets in the ADNI repository, in order to simplify access to these data, and ii) a means for populating this ontology with the actual ADNI data. This computational ontology immediately enables complex queries over the ADNI files, yielding new diagnostic knowledge about Alzheimer's disease.
    Conclusions: The proposed ontology will improve access to the ADNI dataset, allowing queries that extract multivariate datasets for multidimensional and longitudinal statistical analyses. Moreover, it is a candidate for supporting the design and implementation of new information systems for the collection and management of AD data and metadata, and for serving as a reference point for harmonizing or integrating data residing in different sources.
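    As an illustration of how a populated ontology of this kind could be queried, the Python sketch below runs a SPARQL query with the rdflib library. The file name, the adni: namespace, and the class and property names (adni:Subject, adni:hasDiagnosis, adni:hasMMSEScore) are hypothetical placeholders, not the vocabulary actually defined in the paper.

        # Minimal sketch, assuming a populated OWL/RDF export and a hypothetical adni: vocabulary;
        # the real ontology described in the paper defines its own classes and properties.
        from rdflib import Graph

        g = Graph()
        g.parse("adni_ontology_populated.owl", format="xml")  # hypothetical export of the populated ontology

        query = """
        PREFIX adni: <http://example.org/adni#>
        SELECT ?subject ?mmse
        WHERE {
            ?subject a adni:Subject ;
                     adni:hasDiagnosis adni:AlzheimersDisease ;
                     adni:hasMMSEScore ?mmse .
            FILTER (?mmse < 24)
        }
        """

        for row in g.query(query):
            print(row.subject, row.mmse)

    A query of this kind retrieves, in a single step, the subjects matching a clinical criterion that would otherwise require manually joining several ADNI files.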

    OpenTriage: Entity Linkage for Detail Webpages

    We present OpenTriage, a system for extracting structured entities from the detail Web pages of several sites and finding linkages between the extracted data. The system builds an integrated knowledge base by leveraging the redundancy of information with an Open Information Extraction approach: it incrementally processes all the available pages while discovering new attributes. It is based on a hybrid human-machine learning technique that targets a desired quality level. After two preliminary tasks, i.e., blocking and extraction, OpenTriage interleaves two integration tasks, i.e., linkage and matching, while managing uncertainty by means of very simple questions posed to an external oracle.
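    To make the human-in-the-loop idea concrete, the following Python sketch shows one plausible way to combine an automatic matcher with an external oracle: confidently scored pairs are linked automatically, while uncertain ones are turned into simple yes/no questions. All names and thresholds are illustrative assumptions; the abstract does not specify OpenTriage's actual scoring or questioning strategy.

        # A minimal sketch of a hybrid linkage loop, not OpenTriage's actual algorithm.
        # Records are dicts of attribute -> value; the thresholds are arbitrary assumptions.

        HIGH, LOW = 0.9, 0.5  # assumed confidence thresholds

        def score_pair(a, b):
            # Placeholder matcher: fraction of attribute values the two records share.
            shared = set(a.items()) & set(b.items())
            return len(shared) / max(len(a), len(b), 1)

        def ask_oracle(a, b):
            # Stand-in for a very simple question posed to a human or external oracle.
            answer = input(f"Do {a} and {b} describe the same entity? [y/n] ")
            return answer.strip().lower().startswith("y")

        def link(candidate_pairs):
            linked = []
            for a, b in candidate_pairs:
                confidence = score_pair(a, b)
                if confidence >= HIGH:
                    linked.append((a, b))            # accept automatically
                elif confidence >= LOW and ask_oracle(a, b):
                    linked.append((a, b))            # resolved by the oracle
                # pairs below LOW are rejected without spending a question
            return linked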

    Noah: Creating data integration pipelines over continuously extracted web data

    We present Noah, an ongoing research project that aims to develop a system for semi-automatically creating end-to-end Web data processing pipelines. The pipelines continuously extract and integrate information from multiple sites by leveraging the redundancy of the data published on the Web. The system is based on a novel hybrid human-machine learning approach in which the same type of questions can be interchangeably posed both to human crowd workers and to automatic responders based on machine learning (ML) models. From the early stages of a pipeline, crowd workers are engaged to guarantee the output data quality and to collect training data, which are then used to progressively train and evaluate the automatic responders. The latter are later fully deployed into the data processing pipelines to scale the approach and to contain crowdsourcing costs. The combination of guaranteed quality and progressively decreasing costs of the pipelines generated by our system can improve the investment and development processes of the many applications that build on the availability of such data processing pipelines.
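    The hand-off from crowd workers to ML responders could be organized roughly as in the Python sketch below: every question is initially answered by the crowd, the collected answers train and evaluate a model, and the model takes over once its accuracy on held-out crowd labels exceeds a threshold. The class, the thresholds, and the model/crowd interfaces are assumptions for illustration, not Noah's actual design.

        # Minimal sketch, assuming `model` exposes fit/predict and `crowd` exposes answer;
        # the thresholds below are arbitrary and not taken from the paper.
        import random

        class HybridResponder:
            def __init__(self, model, accuracy_threshold=0.95, holdout_fraction=0.1, min_holdout=50):
                self.model = model
                self.accuracy_threshold = accuracy_threshold
                self.holdout_fraction = holdout_fraction
                self.min_holdout = min_holdout
                self.train_set, self.holdout = [], []

            def answer(self, question, crowd):
                if self._model_trusted():
                    return self.model.predict(question)     # cheap automatic answer
                label = crowd.answer(question)               # costly crowd answer
                if random.random() < self.holdout_fraction:
                    self.holdout.append((question, label))   # kept aside for evaluation
                else:
                    self.train_set.append((question, label))
                    self.model.fit(self.train_set)           # progressive retraining
                return label

            def _model_trusted(self):
                if len(self.holdout) < self.min_holdout:     # not enough evaluation data yet
                    return False
                correct = sum(self.model.predict(q) == y for q, y in self.holdout)
                return correct / len(self.holdout) >= self.accuracy_threshold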

    Advanced geostructural survey methods applied to rock mass characterization.

    The location and orientation of rock discontinuities, which are traditionally obtained from geological surveys with obvious drawbacks (safety, rock face accessibility, etc.), may also be derived from a detailed and accurate photogrammetric or laser scanning survey. By selecting from the point cloud measured on the rock face a set of points lying on a particular discontinuity, its location, dip, and dip direction can be computed from the least-squares estimate of the plane interpolating those points. Likewise, the normal vector to the surface may be computed from an interpolation or approximation of the surface by appropriate functions. For this approach to become a real alternative to a traditional survey, in terms of both productivity and accuracy, interactive or automated software tools are necessary to allow the efficient selection of the point sets on the discontinuities or the interpretation of the normal vector pattern. After introducing the two best technologies available today for data acquisition and their performance, the paper presents an approach, based on the random sample consensus (RANSAC) procedure, to the segmentation of the point cloud into subsets, each made of points measured on a discontinuity plane of the rock face. For each subset, the plane's equation coefficients are first determined by robust estimation and then refined by least-squares estimation after outlier removal. The segmentation algorithm has been implemented in RockScan, a software tool developed to facilitate interaction with the point cloud in the identification of the discontinuities; rather than using the three-dimensional (3D) data directly, the selection of regions of interest is performed on oriented images of the rock face. Finally, the application of RockScan to four different test sites is discussed and the results are presented. The sites differ in size (from tens to hundreds of meters), rock surface characteristics, and the technology used to produce the point cloud (photogrammetry in three cases, laser scanning in the fourth), giving the opportunity to test the methodology in different contexts. At the first and fourth sites an extensive traditional survey was performed, providing reference data to validate the RockScan results.
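    The core geometric steps described above (RANSAC plane detection, least-squares refinement on the inliers, and conversion of the plane normal to dip and dip direction) can be sketched in a few lines of Python with NumPy. The distance tolerance, iteration count, and SVD-based fit are illustrative assumptions and do not reproduce RockScan's actual implementation.

        # Minimal sketch, assuming points is an (N, 3) array with x = East, y = North, z = Up;
        # parameters are arbitrary and not taken from RockScan.
        import numpy as np

        def fit_plane(points):
            """Least-squares plane through the points; returns (centroid, unit normal)."""
            centroid = points.mean(axis=0)
            _, _, vt = np.linalg.svd(points - centroid)
            return centroid, vt[-1]                        # normal = direction of least variance

        def ransac_plane(points, n_iter=500, dist_tol=0.05, seed=0):
            """Find the dominant plane: RANSAC selects inliers, then the plane is refit on them."""
            rng = np.random.default_rng(seed)
            best_inliers = np.zeros(len(points), dtype=bool)
            for _ in range(n_iter):
                sample = points[rng.choice(len(points), 3, replace=False)]
                c, n = fit_plane(sample)
                inliers = np.abs((points - c) @ n) < dist_tol
                if inliers.sum() > best_inliers.sum():
                    best_inliers = inliers
            # outliers removed, coefficients refined by least squares on the inliers only
            centroid, normal = fit_plane(points[best_inliers])
            return centroid, normal, best_inliers

        def dip_and_dip_direction(normal):
            """Convert a plane normal to dip and dip direction in degrees."""
            n = normal / np.linalg.norm(normal)
            if n[2] < 0:                                   # force the normal to point upward
                n = -n
            dip = np.degrees(np.arccos(n[2]))              # angle between plane and horizontal
            dip_direction = np.degrees(np.arctan2(n[0], n[1])) % 360.0  # azimuth from North
            return dip, dip_direction

    The dip/dip-direction pairs obtained this way can then be compared directly with the compass measurements of a traditional survey, as done at the two validation sites.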